This session is about the basics of stochastic actor-oriented models (SAOMs). We refer to the RSiena page https://www.stats.ox.ac.uk/~snijders/siena/ for further resources. Here we will simulate from the model.
It is appropriate that we simulate the model, as Siena stands for
Simulation Investigation for Empirical Network Analysis
We will use functionality from the network packages sna
and network (see https://raw.githubusercontent.com/johankoskinen/CHDH-SNA/main/Markdowns/CHDH-SNA-2.Rmd).
The main package for SAOMs is RSiena.
The first time you use them, install the packages (uncomment the install commands)
# install.packages("sna")
# install.packages("network")
# install.packages("RSiena")
Once packages are installed, load them
library("sna")
library("network")
library('RSiena')
The RSiena package automatically loads van de Bunt’s Freshman dataset (see Data description). We will use time-points 3 and 4
?tmp3 # opens the help file with information about the dataset
class(tmp3) # both time-points are regular adjacency matrices
class(tmp4)
dim(tmp3) # 32 students by 32 students
n <- dim(tmp4)[1]
The original tie-variables had 6 levels but have here been dichotomised. It is good practice to check that ties are binary
table(tmp3,useNA = 'always')
table(tmp4,useNA = 'always')
RSiena handles missing data in estimation, but for the purposes of simulating and investigating the network, replace missing values
tmp3[is.na(tmp3)] <- 0 # remove missing
tmp4[is.na(tmp4)] <- 0 # remove missing
Plotting the networks with nodes in the same places illustrates what the SAOM will try to model
par(mfrow = c(1,2))
coordin <- plot(as.network(tmp3), main='Time 3')
plot(as.network(tmp4),coord=coordin, main='Time 4')
Looking closely, we see that some arcs have been added and others removed
In this session, we will assume that we have one initial network \(X(t_0)=x_{obs}(t_0)\), at time \(t_0\) and that we want to say something about a network \(X(t_1)\), at time \(t_1\), \(t_0<t_1\).
We will only use two waves but because of the Markov property of the model, all ingredients extend without limitations to several waves.
To simulate (but also estimate) we need to execute, in turn, each of the functions:
- sienaDependent - formats a matrix of class matrix to class sienaDependent
- sienaDataCreate - formats data to class siena
- getEffects - provides us with the effects we can use for the process
- sienaAlgorithmCreate - sets up simulation/estimation settings

After these steps, the model is simulated/estimated using siena07
The function sienaDependent formats and translates the
two adjacency matrices to a “sienaDependent”
object:
mynet1 <- sienaDependent(array(c(tmp3, tmp4), # the matrices in order
dim=c(32, 32,2))# are set as two slices in an array
)
For networks, sienaDependent takes networks glued together in an \(n \times n \times\) (number of waves) array.
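To see how the stacking works, here is a toy example with two 3 by 3 matrices standing in for two waves (hypothetical data, not the van de Bunt networks):

```r
# Two toy 3x3 adjacency matrices, one per wave
w1 <- matrix(c(0,1,0,
               0,0,1,
               1,0,0), nrow = 3, byrow = TRUE)
w2 <- matrix(c(0,1,1,
               0,0,1,
               0,0,0), nrow = 3, byrow = TRUE)

# Glue them together as two slices of an n x n x 2 array
waves <- array(c(w1, w2), dim = c(3, 3, 2))

# Slice k recovers wave k
identical(waves[, , 1], w1)  # TRUE
identical(waves[, , 2], w2)  # TRUE
```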
You can define, as dependent variables, one-mode networks, two-mode networks, or dependent monadic variables
Once you have defined your variables, these are combined into a
siena object using sienaDataCreate
mydata <- sienaDataCreate(mynet1)
The siena object adds all relevant information about the data
mydata
## Dependent variables: mynet1
## Number of observations: 2
##
## Nodeset Actors
## Number of nodes 32
##
## Dependent variable mynet1
## Type oneMode
## Observations 2
## Nodeset Actors
## Densities 0.15 0.18
To determine what effects are available for our combination of different types of data, we create an effects object
myeff <- getEffects(mydata)
Assume a model where an actor \(i\), when they make a change, only cares about how many ties they have.
myeff <- includeEffects(myeff, recip,include=FALSE)# remove reciprocity which is DEFAULT
## effectName shortName include fix test initialValue parm
## 1 reciprocity recip FALSE FALSE FALSE 0 0
myeff$initialValue[myeff$shortName=='Rate'] <- 5.7288
myeff$initialValue[myeff$shortName=='density'][1] <- -0.7349
For later reference, notice how myeff appears on both sides of the assignment: includeEffects takes myeff as input and returns the updated effects object
What does the rate \(\lambda = 5.7288\) mean?
Each time the network has changed (or an opportunity to change has occurred), each actor waits \(T_i \overset{i.i.d.}{\thicksim} Exp(\lambda)\) time.
The actor that gets to change is the one who waits for the shortest amount of time
waiting <- rexp(32, 5.7288)
hist(waiting)
winner <- which( waiting == min(waiting))
paste('The winner is actor: ', winner)
## [1] "The winner is actor: 6"
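A useful fact about this race (a side note, not part of the RSiena workflow): the minimum of \(n\) independent \(Exp(\lambda)\) waiting times is itself exponential with rate \(n\lambda\), so the expected time until the first actor gets to move is \(1/(n\lambda)\). A quick simulation check:

```r
set.seed(1)
lambda <- 5.7288
n <- 32

# Simulate many races: in each, all 32 actors draw Exp(lambda) waiting times
races <- replicate(10000, min(rexp(n, lambda)))

mean(races)       # close to the theoretical mean
1 / (n * lambda)  # theoretical expected time to the first move
```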
In the example of \(i=1\), deciding between creating a tie to \(j=2,4\) or breaking the tie to \(j=3\)
With our current model
\[ \Pr(X(1\leadsto 2))=\Pr(X(1\leadsto 4))=\frac{e^{\beta_1(1-2x_{12})}}{1+\sum_{j\neq 1} e^{\beta_1(1-2x_{1j})}}=\frac{\exp(-0.7349)}{\exp(-0.7349)+\exp(0.7349)+\exp(-0.7349)+1} \] and
\[ \Pr(X(1\leadsto 3))=\frac{e^{\beta_1(1-2x_{13})}}{1+\sum_{j\neq 1} e^{\beta_1(1-2x_{1j})}}=\frac{\exp(0.7349)}{\exp(-0.7349)+\exp(0.7349)+\exp(-0.7349)+1} \] The conditional probability of \(i=1\) creating the tie to \(j=2\) is thus
exp(-0.7349)/(exp(-0.7349)+exp(0.7349)+exp(-0.7349)+1)
## [1] 0.1185728
and the conditional probability of \(i=1\) deleting the tie to \(j=3\) is thus
exp(0.7349)/(exp(-0.7349)+exp(0.7349)+exp(-0.7349)+1)
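As a check, we can compute actor 1's full choice distribution in one go - the three toggle options plus the option of changing nothing - and verify that it sums to one (a base-R sketch using the tie pattern from the example):

```r
beta <- -0.7349
x1 <- c(`2` = 0, `3` = 1, `4` = 0)  # actor 1's current ties to j = 2, 3, 4

# Utility change of toggling each tie, plus 0 for leaving the network as is
change <- beta * (1 - 2 * x1)
probs <- exp(c(change, nothing = 0))
probs <- probs / sum(probs)

round(probs, 4)  # probs[["2"]] matches the 0.1186 computed above
sum(probs)       # the probabilities sum to 1
```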
The function sienaAlgorithmCreate determines the
simulation/estimation settings. Here we are going to set
simOnly = TRUE and n3 = 100, to simulate 100 networks, starting in \(X(t_3)\)
sim_model <- sienaAlgorithmCreate(
projname = 'sim_model',# name will be appended to output
cond = FALSE, # NOT conditioning on num. changes
useStdInits = FALSE,# we are changing some defaults
nsub = 0 ,# more about subphases in estimation
n3 = 100,# number of simulations (in Phase 3)
simOnly = TRUE)# we only want to simulate
We are ready to simulate!
The function siena07 is the main engine of RSiena and is
used to estimate all models (apart from hierarchical SAOMs)
sim_ans <- siena07( sim_model,# the name of our model
data = mydata,# all of our data - see above for what is in there
effects = myeff,# the effects we have chosen, including parameters
returnDeps = TRUE,# save simulated networks
batch=TRUE )# batch=FALSE opens a GUI
The networks are in sim_ans$sims but to help with
extracting and formatting them we read in the function
reshapeRSienaDeps
source("https://raw.githubusercontent.com/johankoskinen/CHDH-SNA/main/data/reshapeRSienaDeps.R")
We apply it on sim_ans
mySimNets <- reshapeRSienaDeps(sim_ans,n)# n is the number of nodes (defined above)
The object mySimNets is a 100 by 32 by 32
array with 100 adjacency matrices
Plot networks with the same layout as for \(X(t_3)\) above
par(mfrow=c(2,5),# want 2 by 5 panels
oma = c(0,1,0,0) + 0.1,# custom outer margins
mar = c(1,0,0,0) + 0.1)# custom inner margins
plot(as.network(tmp4),# observed network at t4
coord=coordin) # observed network
invisible(# need to suppress printing to console
apply(mySimNets[1:9,,],# for the first 9 networks in array
1,# along first dimension, apply the following function
function(x)
plot(as.network(x),coord=coordin) ) )
It is hard to spot any differences. Plot density of the networks
hist( gden(mySimNets ) )
abline( v = gden(tmp4), col='red')
If actors only care about how many ties they have, we predict the density at \(t_4\) correctly
How about the number of reciprocated ties, i.e. dyads with \(x_{ij}x_{ji}=1\)?
hist( dyad.census(mySimNets)[,1] ,xlim=c(9,50))
abline(v = dyad.census(tmp4)[1], col='red')
Clearly, if the only rule actors have is to care about the number of ties they have, this is not enough to explain why there are so many (46) reciprocal dyads
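The mutual count can also be computed directly from an adjacency matrix: a dyad is reciprocated exactly when \(x_{ij}x_{ji}=1\), so summing the elementwise product of \(x\) and its transpose, and halving, counts the mutual dyads (a base-R sketch on a toy matrix):

```r
# Toy directed adjacency matrix with one mutual dyad (1 <-> 2)
x <- matrix(c(0,1,0,
              1,0,1,
              0,0,0), nrow = 3, byrow = TRUE)

# x * t(x) is 1 exactly where both x_ij and x_ji are 1;
# dividing the total by 2 counts each mutual dyad once
mutuals <- sum(x * t(x)) / 2
mutuals  # 1
```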
Simulate assuming that actors want to have reciprocated ties to others with the model with probabilities \(\Pr(X(i \leadsto j) \mid i \text{ makes decision})\): \[ p_{ij}(X,\beta)=\frac{e^{\beta_d (1-2x_{ij}) +\beta_r (1-2x_{ij})x_{ji} } }{ \sum_h e^{\beta_d (1-2x_{ih}) +\beta_r (1-2x_{ih})x_{hi} } } \]
with parameters \(\beta_d=-1.1046\) and \(\beta_r=1.2608\):
myeff <- includeEffects(myeff, recip,include=TRUE)# reinstate reciprocity which is DEFAULT
## effectName shortName include fix test initialValue parm
## 1 reciprocity recip TRUE FALSE FALSE 0 0
myeff$initialValue[myeff$shortName =='recip'][1] <- 1.2608
# We also need to change the other values
myeff$initialValue[myeff$shortName=='Rate'] <- 6.3477
myeff$initialValue[myeff$shortName=='density'][1] <- -1.1046
Log odds for a new graph

|            | \(x_{ji}=0\) | \(x_{ji}=1\) |
|---|---|---|
| \(x_{ij}=0\) | 0 | \(\beta_d\) |
| \(x_{ij}=1\) | \(\beta_d\) | \(\beta_d+\beta_r\) |
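The log odds can be evaluated with the parameter values set in the code above (a quick sketch):

```r
beta_d <- -1.1046
beta_r <- 1.2608

# Log odds of creating the tie i -> j, given whether j -> i already exists
logodds_unrecip <- beta_d           # x_ji = 0
logodds_recip   <- beta_d + beta_r  # x_ji = 1: creating closes a mutual dyad

c(unreciprocated = logodds_unrecip, reciprocated = logodds_recip)
# Only creating a reciprocating tie has positive log odds (0.1562),
# so such ties are favoured while unreciprocated ones are not
```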
Set up the algorithm
sim_model <- sienaAlgorithmCreate(
projname = 'sim_model',
cond = FALSE,
useStdInits = FALSE,
nsub = 0 ,
n3 = 100,
simOnly = TRUE)
Run simulation
sim_ans <- siena07( sim_model, data = mydata,
effects = myeff,
returnDeps = TRUE, batch=TRUE )
Reshape and extract networks:
mySimNets <- reshapeRSienaDeps(sim_ans,n)
Plot, using the same code as before
par(mfrow=c(2,5),# want 2 by 5 panels
oma = c(0,1,0,0) + 0.1,# custom outer margins
mar = c(1,0,0,0) + 0.1)# custom inner margins
plot(as.network(tmp4),# observed network at t4
coord=coordin) # observed network
invisible(# need to suppress printing to console
apply(mySimNets[1:9,,],# for the first 9 networks in array
1,# along first dimension, apply the following function
function(x)
plot(as.network(x),coord=coordin) ) )
Compare the observed number of reciprocated dyads to the simulated number of reciprocated dyads
hist( dyad.census(mySimNets)[,1])
abline(v = dyad.census(tmp4)[1], col='red')
With an actor preference of 1.26 for having their ties reciprocated, we reproduce the reciprocity
What about the triad census? Check the first 9
triad.census(tmp4)# observed
triad.census(mySimNets[1:9,,])# first 9 simulated networks
Look at the distributions for transitive triads and dense triads
par(mfrow=c(1,2))
hist( triad.census(mySimNets)[,9], xlim=c(8,40))# column 9 is 030T
abline(v = triad.census(tmp4)[9], col='red')
hist( triad.census(mySimNets)[,16], xlim=c(4,24))# column 16 is 300
abline(v = triad.census(tmp4)[16], col='red')
A preference for reciprocity is not enough to explain why reciprocated dyads hang together in triangles or why there are so many transitive triads
Add another positive parameter \(\beta_t\) for the preference for closing open triads.
myeff <- includeEffects(myeff, recip,include=TRUE) # add the effect reciprocated ties
myeff <- includeEffects(myeff, transTrip,include=TRUE)
myeff$initialValue[myeff$shortName=='Rate'] <- 7.0959
myeff$initialValue[myeff$shortName=='density'][1] <- -1.6468
myeff$initialValue[myeff$shortName=='recip'][1] <- 0.8932
myeff$initialValue[myeff$shortName=='transTrip'][1] <- 0.2772
Set up the algorithm
sim_model <- sienaAlgorithmCreate(
projname = 'sim_model',
cond = FALSE,
useStdInits = FALSE,
nsub = 0 ,
n3 = 100,
simOnly = TRUE)
Run simulation
sim_ans <- siena07( sim_model, data = mydata,
effects = myeff,
returnDeps = TRUE, batch=TRUE )
Reshape and extract networks:
mySimNets <- reshapeRSienaDeps(sim_ans,n)
What about the triad census? Check the first 9
triad.census(tmp4)# observed
triad.census(mySimNets[1:9,,])# first 9 simulated networks
Look at the distributions for transitive triads and dense triads
par(mfrow=c(1,2))
hist( triad.census(mySimNets)[,9], xlim=c(2,55))# column 9 is 030T
abline(v = triad.census(tmp4)[9], col='red')
hist( triad.census(mySimNets)[,16], xlim=c(4,86))# column 16 is 300
abline(v = triad.census(tmp4)[16], col='red')
Having preferences for reciprocated ties and transitive ties seems to explain ‘most’ of the structure
How did I pick the numbers?
Manually, you could tweak the parameters, simulate, and compare the simulated statistics to the observed ones until they match.
Luckily, we have an algorithm (stochastic approximation) for automating this
We define the model in the same way as when we simulated data
myeff <- getEffects(mydata)# We already have our effects, but let's start from scratch
myeff <- includeEffects(myeff, recip,include=TRUE) # add the effect reciprocated ties (it is already included by DEFAULT though)
myeff <- includeEffects(myeff, transTrip,include=TRUE)# we want to estimate the transitivity effect
Set up the algorithm with default values
(siena07 will assume you want to estimate)
est_model <- sienaAlgorithmCreate(
projname = 'est_model',
n3 = 1000,# number of simulations in Phase 3 (default *is* 1000)
simOnly = FALSE,# default *is* FALSE
cond = FALSE,
useStdInits = FALSE)
Estimate!
est_ans <-siena07(est_model,# algorithm object
data = mydata, # the data object we created earlier
effects = myeff,# same effects as in simulation
returnDeps = TRUE,
batch=TRUE)
Estimation gives us an ANOVA table with
- Parameter (point) estimates
- Standard errors of estimates
- Convergence statistics
summary( est_ans )
## Estimates, standard errors and convergence t-ratios
##
## Estimate Standard Convergence
## Error t-ratio
## 1. rate basic rate parameter mynet1 7.0173 ( 0.8899 ) -0.0627
## 2. eval outdegree (density) -1.6416 ( 0.1302 ) -0.0570
## 3. eval reciprocity 0.9036 ( 0.1993 ) -0.0389
## 4. eval transitive triplets 0.2733 ( 0.0469 ) -0.0633
##
## Overall maximum convergence ratio: 0.0812
##
##
## Total of 2016 iteration steps.
##
## Covariance matrix of estimates (correlations below diagonal)
##
## 0.792 -0.004 0.003 -0.009
## -0.036 0.017 -0.006 -0.004
## 0.019 -0.250 0.040 -0.002
## -0.226 -0.611 -0.254 0.002
##
## Derivative matrix of expected statistics X by parameters:
##
## 13.512 9.039 7.303 75.896
## 68.906 212.472 144.053 1166.307
## 21.404 74.457 89.392 488.667
## 180.351 572.859 471.741 4400.670
##
## Covariance matrix of X (correlations below diagonal):
##
## 132.839 122.691 91.078 857.644
## 0.571 347.807 264.039 2217.894
## 0.492 0.882 257.756 1874.640
## 0.570 0.911 0.894 17058.503
These estimates look very much like the numbers that I picked arbitrarily - what makes these better?
The observed statistics that we want to ‘hit’ on average are
est_ans$targets
## [1] 125 175 92 431
## attr(,"fromData")
## [1] TRUE
# the same as sim_ans$targets
The simulated statistics for the parameters from the estimation are
colMeans(est_ans$sf2)
## [,1] [,2] [,3] [,4]
## [1,] 124.277 173.937 91.376 422.727
# also provided in:
# est_ans$estMeans
The simulated statistics for the parameters from the simulation are
sim_ans$estMeans
## [1] 125.21210 175.25102 92.80872 433.96574
## attr(,"fromData")
## [1] TRUE
We can calculate within how many standard deviation units of the targets we are for each statistic \[ \frac{\bar{z}_k-z_{obs}}{sd(z_k)} \] The deviations using the estimates from estimation:
(colMeans(est_ans$sf2)-est_ans$targets)/sqrt(diag(var(est_ans$sf2[,1,])))
## [,1] [,2] [,3] [,4]
## [1,] -0.06273 -0.0569986 -0.0388669 -0.06334212
## attr(,"fromData")
## [1] TRUE
# Also provided in:
# est_ans$tstat
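To see what this calculation does, here is the same computation on made-up simulated statistics (hypothetical numbers for illustration, not RSiena output):

```r
set.seed(2)
targets <- c(125, 175, 92, 431)  # 'observed' statistics z_obs

# Pretend Phase 3 produced 1000 draws of each of the four statistics
sims <- sapply(targets, function(m) rnorm(1000, mean = m, sd = 10))

# Scaled deviations: how many sd units the simulated means miss the targets by
tratios <- (colMeans(sims) - targets) / apply(sims, 2, sd)
round(tratios, 3)  # all close to 0, as for a converged model
```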
For estimates from simulation:
(colMeans(sim_ans$sf2)-sim_ans$targets)/sqrt(diag(var(sim_ans$sf2[,1,])))
For both sets of parameters, the simulated statistics are on average within less than 0.1 standard deviation units of the observed statistics
As a quick illustration, we can see what happens when we set rate, density, and reciprocity to the estimated values
myeff$initialValue[myeff$shortName=='Rate'] <- est_ans$theta[1]
myeff$initialValue[myeff$shortName=='density'][1] <- est_ans$theta[2]
myeff$initialValue[myeff$shortName=='recip'][1] <-est_ans$theta[3]
but pick another value for transitivity:
myeff$initialValue[myeff$shortName=='transTrip'][1] <- 0.1
and then, simulate!
sim_model <- sienaAlgorithmCreate(
projname = 'sim_model',
cond = FALSE,
useStdInits = FALSE,
nsub = 0 ,
n3 = 1000,
simOnly = TRUE)
sim_ans <- siena07( sim_model, data = mydata,
effects = myeff,
returnDeps = TRUE, batch=TRUE )
Calculate the scaled difference between the average statistics and the observed statistics again
(colMeans(sim_ans$sf2)-sim_ans$targets)/sqrt(diag(var(sim_ans$sf2[,1,])))
The average simulated statistics:
sim_ans$estMeans
are not close to the observed
sim_ans$targets
Decreasing the transitivity parameter results in networks having too few transitive triads
In general, we say that the estimation has converged and estimates can be interpreted if all convergence t-ratios are smaller than 0.1 in absolute value and the overall maximum convergence ratio is smaller than 0.25.
The aim of the estimation algorithm is to find estimates \(\hat{\theta}\), such that \[
\underbrace{E_{\hat{\theta}}\{ Z[X(t_1)] \mid
X(t_0)=x_{obs}(t_0)\}}_{\text{'average'
statistics}}=Z[x_{obs}(t_1)]
\] The stochastic approximation algorithm in siena07
solves this equation computationally in three phases:
- Phase 1 estimates the derivatives of the expected statistics with respect to the parameters
- Phase 2 updates the parameter values in a sequence of subphases
- Phase 3 checks convergence and computes standard errors, using n3 simulations

You may notice that the difference that is minimised in this algorithm is not \(\bar{z}-z_{obs}\). Only one network is simulated in each iteration - but it works (the other way also works but is less efficient)
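The stochastic approximation (Robbins-Monro) idea can be illustrated on a toy problem: find \(\theta\) such that the expected value of a \(\text{Poisson}(e^{\theta})\) draw equals an observed count, simulating only one draw per iteration (a sketch, not the actual siena07 implementation):

```r
set.seed(3)
z_obs <- 20  # 'observed statistic'
theta <- 0   # starting value; the solution is log(20) ~ 3

# Robbins-Monro: nudge theta against the deviation, with a decreasing gain
for (iter in 1:5000) {
  z_sim <- rpois(1, lambda = exp(theta))  # one simulated statistic per step
  theta <- theta - (1 / iter) * (z_sim - z_obs) / z_obs
}

exp(theta)  # close to z_obs = 20
```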
The second column in the ANOVA table contains the nominal standard errors, i.e. the approximate standard deviations of the estimators of the \(\theta\)’s. Typically, these are used for standard hypothesis tests:
For effect/parameter \(k\), test \[ H_0:\beta_k=\beta_{k,0}=0 \] against \[ H_1:\beta_k\neq 0 \] The test statistic \[ \frac{\hat{\beta}_k-\beta_{k,0} }{sd(\hat{\beta}_k)}=\frac{\hat{\beta}_k }{sd(\hat{\beta}_k)}\approx \frac{\hat{\beta}_k }{se(\hat{\beta}_k)}, \] is approximately normally distributed \(N(0,1)\), if \(H_0\) is true.
Given the number of assumptions and approximations we need to make, I would advise that we stick to testing at the nominal 5% level, and reject \(H_0\) when the test statistic is greater than \(2\) in absolute value.
In the simple model we estimated, let us test \(H_0:\beta_t=0\). The observed test statistic
est_ans$theta[4]/est_ans$se[4]
## [1] 5.824495
is considerably larger than 2, and we can reject the null-hypothesis that the true value of the transitivity parameter is \(0\).
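Under the normal approximation, we can also attach a two-sided p-value to this statistic (a sketch; the value is the test statistic computed above):

```r
z <- 5.824495  # observed test statistic for the transitivity parameter

# Two-sided p-value under the standard normal reference distribution
p <- 2 * pnorm(-abs(z))
p  # far below 0.05, consistent with rejecting H0
```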
Intuitively, we could also test if we ‘need’ \(\beta_t\), by estimating the model with \(\beta_t=0\), and check the convergence t-statistic. You can do this yourself now!
There are many different types of covariates
| Type | RSiena type |
|---|---|
| Constant monadic covariates | coCovar |
| Changing monadic covariates | varCovar |
| Constant dyadic covariate | coDyadCovar |
| Changing dyadic covariate | varDyadCovar |
| Changing (covariate) network | sienaDependent |
The usual nomenclature for measurement scales, and the distinction between metric and non-metric variables, applies.
The adjacency matrices s501, s502, and
s503, for the so-called s-50 dataset, the network of 50
girls, are also available with RSiena. For a full
description of this dataset see ?s50 or http://www.stats.ox.ac.uk/~snijders/siena/s50_data.htm
For clarity, we rename the adjacency matrices
friend.data.w1 <- s501
friend.data.w2 <- s502
friend.data.w3 <- s503
Note that we have three (3) waves.
Among the many monadic covariates are ‘smoking’ and ‘drinking’:
drink <- s50a
smoke <- s50s
Here, each variable has its own \(n \times 3\) data matrix, one observation for each individual and each time-point
head(smoke)
We are going to test \[ H_0: \text{smokers have as many friends as non-smokers} \] against an alternative hypothesis stating that there is a difference. We will interpret this as smokers “sending as many ties” and “receiving as many ties” as non-smokers.
We will use smoking at \(t_0\) as our explanatory variable and define ‘smoking’ as a value of 2 or 3.
smoke1.matrix <- as.numeric( smoke[,1]>1 )
table(smoke1.matrix, useNA='always')
To tell RSiena how to use this variable, we format it
smoke1 <- coCovar( smoke1.matrix )
Print to screen to see how R has interpreted the variable
smoke1
Format the dependent network variable as before:
friendshipData <- array( c( friend.data.w1,
friend.data.w2,
friend.data.w3 ),
dim = c( 50, 50, 3 ) )
friendship <- sienaDependent(friendshipData)
Using sienaDataCreate, we give RSiena the
opportunity to figure out how data are structured and what types of
effects can be calculated from it
s50data <- sienaDataCreate( friendship, smoke1)
Since we have both a network and monadic covariates, we will have more effects available to us than previously
s50.effects <- getEffects(s50data)
For testing our hypothesis we want a statistic that corresponds to \[ v_i x_{ij} \] for each tie-variable, and where \(v_i\) is one or zero according to whether \(i\) is a smoker or not, respectively. This is the sender effect
s50.effects <- includeEffects( s50.effects,# we "add an effect" to our effects object
egoX,# the shortname here evokes that variable for 'ego' affects decision
interaction1 = "smoke1" # the variable we want to interact the DV with
)
Next, we want a statistic that corresponds to \[ v_j x_{ij} \] so that if the corresponding parameter is positive, then actors are more likely to form (maintain) ties to actors \(j\) that have \(v_j=1\), i.e. are smokers
s50.effects <- includeEffects( s50.effects,
altX,# the shortname here evokes that variable for 'alter' affects decision of ego
interaction1 = "smoke1" # the variable we want to interact the DV with
)
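Both statistics can be computed by hand from an adjacency matrix and a covariate vector, which makes clear what the effects measure (a base-R sketch on a toy network, not the s50 data):

```r
# Toy network of 4 actors; v marks smokers (1) and non-smokers (0)
x <- matrix(c(0,1,1,0,
              0,0,1,0,
              1,0,0,1,
              0,0,0,0), nrow = 4, byrow = TRUE)
v <- c(1, 0, 1, 0)

# egoX: sum_ij v_i * x_ij -- out-ties weighted by the sender's covariate
egoX <- sum(v * rowSums(x))
# altX: sum_ij v_j * x_ij -- ties weighted by the receiver's covariate
altX <- sum(colSums(x) * v)

c(egoX = egoX, altX = altX)  # smokers here send 4 ties and receive 3
```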
For the van de Bunt dataset, we saw that triadic closure was important. We can choose to include it because we believe that it is important, but also as a control for our hypothesised effects
s50.effects <- includeEffects(s50.effects,# We "add an effect" to s50.effects
transTrip,# shortname of the effect
include=TRUE)
Our specified model is
s50.effects
## effectName include fix test initialValue parm
## 1 constant friendship rate (period 1) TRUE FALSE FALSE 4.69604 0
## 2 constant friendship rate (period 2) TRUE FALSE FALSE 4.32885 0
## 3 outdegree (density) TRUE FALSE FALSE -1.46770 0
## 4 reciprocity TRUE FALSE FALSE 0.00000 0
## 5 transitive triplets TRUE FALSE FALSE 0.00000 0
## 6 smoke1 alter TRUE FALSE FALSE 0.00000 0
## 7 smoke1 ego TRUE FALSE FALSE 0.00000 0
Specify the simulation settings
s50algorithm.simple <- sienaAlgorithmCreate( projname = 's50_simple' )# We are only using defaults
Estimating the model with default settings is simply a call to siena07
s50.simple.ans <- siena07( s50algorithm.simple,# estimation settings
data = s50data,# data object that knows DV and IV
effects = s50.effects,# the effects we specified
batch = TRUE,
returnDeps = TRUE )
Print the results to screen
summary( s50.simple.ans )
## Estimates, standard errors and convergence t-ratios
##
## Estimate Standard Convergence
## Error t-ratio
##
## Rate parameters:
## 0.1 Rate parameter period 1 6.4349 ( 1.0987 )
## 0.2 Rate parameter period 2 5.1974 ( 0.9066 )
##
## Other parameters:
## 1. eval outdegree (density) -2.6764 ( 0.1154 ) -0.0331
## 2. eval reciprocity 2.4479 ( 0.1916 ) -0.0550
## 3. eval transitive triplets 0.6220 ( 0.0726 ) -0.0363
## 4. eval smoke1 alter -0.0044 ( 0.1852 ) -0.0311
## 5. eval smoke1 ego 0.0534 ( 0.1847 ) -0.0608
##
## Overall maximum convergence ratio: 0.0987
##
##
## Total of 2028 iteration steps.
##
## Covariance matrix of estimates (correlations below diagonal)
##
## 0.013 -0.013 -0.004 0.000 -0.001
## -0.577 0.037 -0.002 -0.001 0.005
## -0.448 -0.128 0.005 -0.001 -0.002
## 0.022 -0.028 -0.075 0.034 -0.011
## -0.057 0.140 -0.135 -0.312 0.034
##
## Derivative matrix of expected statistics X by parameters:
##
## 277.420 217.440 481.290 10.525 8.732
## 120.349 131.633 228.620 3.482 1.581
## 340.918 310.096 1161.101 23.951 24.383
## 14.368 11.182 43.131 43.645 24.648
## 12.212 7.386 41.898 26.779 41.385
##
## Covariance matrix of X (correlations below diagonal):
##
## 452.405 388.484 1077.945 21.118 17.243
## 0.930 385.851 1022.004 16.567 13.924
## 0.799 0.821 4020.382 60.975 57.664
## 0.125 0.107 0.121 62.662 45.975
## 0.110 0.096 0.123 0.785 54.704
Are all convergence statistics less than 0.1 in absolute value? If so, we can test our hypothesis - do we reject our \(H_0\)?
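As a sketch of how to carry out the test, the Wald statistics for the two smoking effects can be computed from the estimates and standard errors printed above (numbers copied from the summary):

```r
# smoke1 alter and smoke1 ego, from the summary above
est <- c(alter = -0.0044, ego = 0.0534)
se  <- c(alter =  0.1852, ego = 0.1847)

wald <- est / se
round(wald, 3)
# Both statistics are far smaller than 2 in absolute value:
# no evidence that smokers send or receive more ties than non-smokers
abs(wald) > 2
```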
To account for (test) possible assortativity on smoking, we add the homophily effect:
s50.effects <- includeEffects( s50.effects,# we "add an effect" to our effects object
sameX,# the shortname
interaction1 = "smoke1" # the variable we want to interact the DV with
)
s50.effects <- includeEffects( s50.effects,egoXaltX,interaction1 = "smoke1",include=FALSE)# make sure the ego-alter interaction is excluded
s50.hom.ans <- siena07( s50algorithm.simple,# estimation settings
data = s50data,# data object that knows DV and IV
effects = s50.effects,# the effects we specified
batch = TRUE,
returnDeps = TRUE )
Print the results to screen
summary( s50.hom.ans )
## Estimates, standard errors and convergence t-ratios
##
## Estimate Standard Convergence
## Error t-ratio
##
## Rate parameters:
## 0.1 Rate parameter period 1 6.4430 ( 1.0964 )
## 0.2 Rate parameter period 2 5.2893 ( 0.9084 )
##
## Other parameters:
## 1. eval outdegree (density) -2.9046 ( 0.1637 ) 0.0356
## 2. eval reciprocity 2.4388 ( 0.1924 ) 0.0533
## 3. eval transitive triplets 0.6160 ( 0.0737 ) 0.0657
## 4. eval smoke1 alter 0.0945 ( 0.1874 ) -0.0293
## 5. eval smoke1 ego 0.1488 ( 0.1977 ) -0.0401
## 6. eval same smoke1 0.3245 ( 0.1633 ) 0.0673
##
## Overall maximum convergence ratio: 0.1157
##
##
## Total of 2258 iteration steps.
##
## Covariance matrix of estimates (correlations below diagonal)
##
## 0.027 -0.012 -0.004 -0.003 -0.007 -0.019
## -0.395 0.037 -0.002 -0.001 0.003 -0.001
## -0.294 -0.113 0.005 -0.002 -0.001 -0.001
## -0.091 -0.016 -0.123 0.035 -0.011 0.007
## -0.205 0.070 -0.067 -0.286 0.039 0.008
## -0.696 -0.043 -0.057 0.244 0.245 0.027
##
## Derivative matrix of expected statistics X by parameters:
##
## 287.915 228.107 501.808 -3.357 -0.315 218.832
## 126.033 137.763 253.948 1.169 0.525 96.396
## 359.498 329.323 1221.325 11.432 10.226 275.739
## -2.637 -0.540 12.238 47.028 29.777 -20.610
## 1.783 2.370 23.493 29.581 42.286 -16.948
## 220.882 179.559 390.495 -19.899 -16.401 218.553
##
## Covariance matrix of X (correlations below diagonal):
##
## 477.668 412.962 1184.749 1.990 1.951 370.758
## 0.930 412.982 1150.475 8.900 7.575 322.447
## 0.804 0.840 4541.848 25.824 25.803 929.114
## 0.011 0.053 0.046 68.803 55.009 -24.409
## 0.011 0.047 0.048 0.840 62.378 -23.102
## 0.913 0.854 0.742 -0.158 -0.157 345.439
s50.hom.ans.2 <- siena07( s50algorithm.simple,# estimation settings
data = s50data,# data object that knows DV and IV
effects = s50.effects,# the effects we specified
batch = TRUE,
returnDeps = TRUE,
prevAns = s50.hom.ans)
If all convergence t-ratios are less than 0.1 in absolute value,
we can test our hypothesis, controlling for possible assortativity/homophily
What about homophily on smoking - do smokers befriend other smokers?
How do we know if the model is any good?
You may simulate from the model to see if it replicates data
If you estimate a model with siena07 and set
returnDeps=TRUE, the answer object returns the simulations
from Phase 3. Let us try with the model for s50 from
CHDH-SNA-3.Rmd.
friend.data.w1 <- s501# s50 wave 1
friend.data.w2 <- s502# s50 wave 2
friend.data.w3 <- s503# s50 wave 3
smoke <- s50s# s50 smoking wave 1,2, and 3
smoke1.matrix <- as.numeric( smoke[,1]>1 )# dichotomise
smoke1 <- coCovar( smoke1.matrix )# format as constant covariate
friendshipData <- array( c( friend.data.w1,
friend.data.w2,
friend.data.w3 ),
dim = c( 50, 50, 3 ) )# paste together adjacency matrices in array
friendship <- sienaDependent(friendshipData)# designate network as dependent variable
s50data <- sienaDataCreate( friendship, smoke1) # create the siena data object
s50.effects <- getEffects(s50data) # create an effects structure
s50.effects <- includeEffects(s50.effects,# We "add an effect" to s50.effects
transTrip,# shortname of the effect
include=TRUE)
s50.effects <- includeEffects( s50.effects,# we "add an effect" to our effects object
egoX,# the shortname here evokes that variable for 'ego' affects decision
interaction1 = "smoke1" # the variable we want to interact the DV with
)
s50.effects <- includeEffects( s50.effects,
altX,# the shortname here evokes that variable for 'alter' affects decision of ego
interaction1 = "smoke1" # the variable we want to interact the DV with
)
s50.effects <- includeEffects( s50.effects,egoXaltX,interaction1 = "smoke1",include=TRUE)# this is homophily on smoking
s50algorithm.simple <- sienaAlgorithmCreate( projname = 's50_simple' )# We are only using defaults
Now estimate
s50.hom.ans <- siena07( s50algorithm.simple,# estimation settings
data = s50data,# data object that knows DV and IV
effects = s50.effects,# the effects we specified
batch = TRUE,
returnDeps = TRUE )# IMPORTANT: we want to have access to networks simulated in Phase 3
##
## Start phase 0
## theta: -1.47 0.00 0.00 0.00 0.00 0.00
##
## Start phase 1
## Phase 1 Iteration 1 Progress: 0%
## Phase 1 Iteration 2 Progress: 0%
## Phase 1 Iteration 3 Progress: 0%
## Phase 1 Iteration 4 Progress: 0%
## Phase 1 Iteration 5 Progress: 0%
## Phase 1 Iteration 10 Progress: 0%
## Phase 1 Iteration 15 Progress: 1%
## Phase 1 Iteration 20 Progress: 1%
## Phase 1 Iteration 25 Progress: 1%
## Phase 1 Iteration 30 Progress: 1%
## Phase 1 Iteration 35 Progress: 1%
## Phase 1 Iteration 40 Progress: 1%
## Phase 1 Iteration 45 Progress: 2%
## Phase 1 Iteration 50 Progress: 2%
## theta: -1.7159 0.3763 0.3118 0.0182 0.0016 0.1398
##
## Start phase 2.1
## Phase 2 Subphase 1 Iteration 1 Progress: 12%
## Phase 2 Subphase 1 Iteration 2 Progress: 12%
## theta -1.8657 0.6637 0.5193 0.0213 0.0276 0.2348
## ac 0.696 1.395 2.067 -0.606 0.866 2.336
## Phase 2 Subphase 1 Iteration 3 Progress: 12%
## Phase 2 Subphase 1 Iteration 4 Progress: 12%
## theta -2.1598 1.4074 0.5979 0.0317 0.1380 0.4926
## ac 1.897 0.816 -0.493 0.659 0.944 2.276
## Phase 2 Subphase 1 Iteration 5 Progress: 12%
## Phase 2 Subphase 1 Iteration 6 Progress: 12%
## theta -2.3800 2.0104 0.4857 -0.0925 0.1297 0.6382
## ac 2.132 0.574 -1.974 0.579 1.106 2.194
## Phase 2 Subphase 1 Iteration 7 Progress: 12%
## Phase 2 Subphase 1 Iteration 8 Progress: 12%
## theta -2.541 2.390 0.413 -0.142 0.127 0.693
## ac 1.433 0.267 -1.971 0.645 1.142 0.285
## Phase 2 Subphase 1 Iteration 9 Progress: 12%
## Phase 2 Subphase 1 Iteration 10 Progress: 12%
## theta -2.6708 2.6171 0.4055 -0.2373 0.0472 0.7251
## ac 0.6494 0.1315 -1.6781 0.5679 1.0064 -0.0937
## Phase 2 Subphase 1 Iteration 1 Progress: 12%
## Phase 2 Subphase 1 Iteration 2 Progress: 12%
## Phase 2 Subphase 1 Iteration 3 Progress: 12%
## Phase 2 Subphase 1 Iteration 4 Progress: 12%
## Phase 2 Subphase 1 Iteration 5 Progress: 12%
## Phase 2 Subphase 1 Iteration 6 Progress: 12%
## Phase 2 Subphase 1 Iteration 7 Progress: 12%
## Phase 2 Subphase 1 Iteration 8 Progress: 12%
## Phase 2 Subphase 1 Iteration 9 Progress: 12%
## Phase 2 Subphase 1 Iteration 10 Progress: 12%
## theta -2.69690 2.48042 0.59181 -0.08048 0.00803 0.61622
## ac -0.7139 -0.6910 -0.6487 -0.2255 -0.2997 -0.0865
## theta: -2.69690 2.48042 0.59181 -0.08048 0.00803 0.61622
##
## Start phase 2.2
## Phase 2 Subphase 2 Iteration 1 Progress: 20%
## Phase 2 Subphase 2 Iteration 2 Progress: 20%
## Phase 2 Subphase 2 Iteration 3 Progress: 20%
## Phase 2 Subphase 2 Iteration 4 Progress: 20%
## Phase 2 Subphase 2 Iteration 5 Progress: 20%
## Phase 2 Subphase 2 Iteration 6 Progress: 20%
## Phase 2 Subphase 2 Iteration 7 Progress: 20%
## Phase 2 Subphase 2 Iteration 8 Progress: 20%
## Phase 2 Subphase 2 Iteration 9 Progress: 20%
## Phase 2 Subphase 2 Iteration 10 Progress: 20%
## theta -2.6944 2.4645 0.6016 -0.0819 0.0380 0.6127
## ac -0.2653 -0.4414 -0.4894 -0.3572 -0.1414 -0.0522
## theta: -2.6944 2.4645 0.6016 -0.0819 0.0380 0.6127
##
## Start phase 2.3
## Phase 2 Subphase 3 Iteration 1 Progress: 29%
## Phase 2 Subphase 3 Iteration 2 Progress: 29%
## Phase 2 Subphase 3 Iteration 3 Progress: 29%
## Phase 2 Subphase 3 Iteration 4 Progress: 29%
## Phase 2 Subphase 3 Iteration 5 Progress: 29%
## Phase 2 Subphase 3 Iteration 6 Progress: 29%
## Phase 2 Subphase 3 Iteration 7 Progress: 29%
## Phase 2 Subphase 3 Iteration 8 Progress: 29%
## Phase 2 Subphase 3 Iteration 9 Progress: 29%
## Phase 2 Subphase 3 Iteration 10 Progress: 29%
## theta -2.6936 2.4525 0.6068 -0.0789 -0.0181 0.6312
## ac -0.329 -0.336 -0.338 -0.200 -0.159 -0.141
## theta: -2.6936 2.4525 0.6068 -0.0789 -0.0181 0.6312
##
## Start phase 2.4
## Phase 2 Subphase 4 Iteration 1 Progress: 43%
## Phase 2 Subphase 4 Iteration 2 Progress: 43%
## Phase 2 Subphase 4 Iteration 3 Progress: 43%
## Phase 2 Subphase 4 Iteration 4 Progress: 43%
## Phase 2 Subphase 4 Iteration 5 Progress: 43%
## Phase 2 Subphase 4 Iteration 6 Progress: 43%
## Phase 2 Subphase 4 Iteration 7 Progress: 43%
## Phase 2 Subphase 4 Iteration 8 Progress: 43%
## Phase 2 Subphase 4 Iteration 9 Progress: 43%
## Phase 2 Subphase 4 Iteration 10 Progress: 43%
## theta -2.6942 2.4302 0.6134 -0.0746 -0.0206 0.6573
## ac -0.0705 -0.0800 -0.1120 -0.0932 -0.0111 -0.0830
## theta: -2.6942 2.4302 0.6134 -0.0746 -0.0206 0.6573
##
## Start phase 3
## Phase 3 Iteration 500 Progress 83%
## Phase 3 Iteration 1000 Progress 100%
Re-estimate, using the previous result as starting values via prevAns:
s50.hom.ans <- siena07( s50algorithm.simple,# estimation settings
data = s50data,# data object that knows DV and IV
effects = s50.effects,# the effects we specified
batch = TRUE,
returnDeps = TRUE ,prevAns = s50.hom.ans)
##
## Start phase 0
## theta: -2.6942 2.4302 0.6134 -0.0746 -0.0206 0.6573
##
## Start phase 2.1
## Phase 2 Subphase 1 Iteration 1 Progress: 12%
## Phase 2 Subphase 1 Iteration 2 Progress: 12%
## theta -2.7172 2.4439 0.6205 -0.0734 -0.0755 0.6634
## ac 0.7217 0.1262 -0.0507 -0.8844 -0.9030 0.6648
## Phase 2 Subphase 1 Iteration 3 Progress: 12%
## Phase 2 Subphase 1 Iteration 4 Progress: 12%
## theta -2.74477 2.45664 0.61227 -0.03246 -0.00944 0.57302
## ac 0.40180 -0.00272 0.15045 -1.20007 -1.00926 0.55683
## Phase 2 Subphase 1 Iteration 5 Progress: 12%
## Phase 2 Subphase 1 Iteration 6 Progress: 12%
## theta -2.73622 2.43253 0.62459 -0.01128 -0.00856 0.60414
## ac 0.143 -0.137 0.155 -1.418 -1.053 1.419
## Phase 2 Subphase 1 Iteration 7 Progress: 12%
## Phase 2 Subphase 1 Iteration 8 Progress: 12%
## theta -2.759 2.430 0.634 -0.038 0.005 0.700
## ac 0.187 -0.209 0.126 -0.745 -0.995 0.750
## Phase 2 Subphase 1 Iteration 9 Progress: 12%
## Phase 2 Subphase 1 Iteration 10 Progress: 12%
## theta -2.7456 2.4369 0.6306 -0.1035 0.0799 0.6651
## ac 0.5008 0.0322 0.2302 -0.2226 -0.6152 0.3654
## theta -2.75339 2.46982 0.63906 -0.08312 -0.00707 0.62042
## ac -0.46908 -0.14838 -0.00779 -0.04034 -0.16157 -0.08252
## theta: -2.75339 2.46982 0.63906 -0.08312 -0.00707 0.62042
##
## Start phase 2.2
## Phase 2 Subphase 2 Iteration 1 Progress: 20%
## Phase 2 Subphase 2 Iteration 2 Progress: 20%
## Phase 2 Subphase 2 Iteration 3 Progress: 20%
## Phase 2 Subphase 2 Iteration 4 Progress: 20%
## Phase 2 Subphase 2 Iteration 5 Progress: 20%
## Phase 2 Subphase 2 Iteration 6 Progress: 20%
## Phase 2 Subphase 2 Iteration 7 Progress: 20%
## Phase 2 Subphase 2 Iteration 8 Progress: 20%
## Phase 2 Subphase 2 Iteration 9 Progress: 20%
## Phase 2 Subphase 2 Iteration 10 Progress: 20%
## theta -2.69628 2.42721 0.61791 -0.08839 -0.00232 0.63322
## ac -0.1884 -0.0708 -0.0932 -0.1093 0.0289 -0.0515
## theta: -2.69628 2.42721 0.61791 -0.08839 -0.00232 0.63322
##
## Start phase 2.3
## Phase 2 Subphase 3 Iteration 1 Progress: 29%
## Phase 2 Subphase 3 Iteration 2 Progress: 29%
## Phase 2 Subphase 3 Iteration 3 Progress: 29%
## Phase 2 Subphase 3 Iteration 4 Progress: 29%
## Phase 2 Subphase 3 Iteration 5 Progress: 29%
## Phase 2 Subphase 3 Iteration 6 Progress: 29%
## Phase 2 Subphase 3 Iteration 7 Progress: 29%
## Phase 2 Subphase 3 Iteration 8 Progress: 29%
## Phase 2 Subphase 3 Iteration 9 Progress: 29%
## Phase 2 Subphase 3 Iteration 10 Progress: 29%
## theta -2.7034 2.4529 0.6095 -0.0704 -0.0396 0.6493
## ac 0.0034 0.0238 0.0234 -0.0878 -0.0737 0.0519
## theta: -2.7034 2.4529 0.6095 -0.0704 -0.0396 0.6493
##
## Start phase 2.4
## Phase 2 Subphase 4 Iteration 1 Progress: 43%
## Phase 2 Subphase 4 Iteration 2 Progress: 43%
## Phase 2 Subphase 4 Iteration 3 Progress: 43%
## Phase 2 Subphase 4 Iteration 4 Progress: 43%
## Phase 2 Subphase 4 Iteration 5 Progress: 43%
## Phase 2 Subphase 4 Iteration 6 Progress: 43%
## Phase 2 Subphase 4 Iteration 7 Progress: 43%
## Phase 2 Subphase 4 Iteration 8 Progress: 43%
## Phase 2 Subphase 4 Iteration 9 Progress: 43%
## Phase 2 Subphase 4 Iteration 10 Progress: 43%
## theta -2.6905 2.4359 0.6068 -0.0713 -0.0248 0.6544
## ac 0.08919 0.02687 0.00197 -0.03270 0.00559 -0.01861
## theta: -2.6905 2.4359 0.6068 -0.0713 -0.0248 0.6544
##
## Start phase 3
## Phase 3 Iteration 500 Progress 83%
## Phase 3 Iteration 1000 Progress 100%
To test whether the model replicates the data well, we have to define what we want to look at.
Auxiliary functions for the in- and out-degree distributions are already provided in the GOF routine
(gofi0 <- sienaGOF(s50.hom.ans, IndegreeDistribution, verbose=TRUE, join=TRUE,
varName="friendship"))
## Detected 1000 iterations and 1 group.
## Calculating auxiliary statistics for periods 1 2 .
## Period 1
## > Completed 100 calculations > Completed 200 calculations > Completed 300 calculations > Completed 400 calculations > Completed 500 calculations > Completed 600 calculations > Completed 700 calculations > Completed 800 calculations > Completed 900 calculations > Completed 1000 calculations > Completed 1000 calculations
## Period 2
## > Completed 100 calculations > Completed 200 calculations > Completed 300 calculations > Completed 400 calculations > Completed 500 calculations > Completed 600 calculations > Completed 700 calculations > Completed 800 calculations > Completed 900 calculations > Completed 1000 calculations > Completed 1000 calculations
## Siena Goodness of Fit ( IndegreeDistribution ), all periods
## =====
## Monte Carlo Mahalanobis distance test p-value: 0.451
## -----
## One tailed test used (i.e. estimated probability of greater distance than observation).
## -----
## Calculated joint MHD = ( 11.15 ) for current model.
(gofo0 <- sienaGOF(s50.hom.ans, OutdegreeDistribution, verbose=TRUE, join=TRUE,
levls=c(0:10,15,20),varName="friendship"))
## Detected 1000 iterations and 1 group.
## Calculating auxiliary statistics for periods 1 2 .
## Period 1
## > Completed 100 calculations > Completed 200 calculations > Completed 300 calculations > Completed 400 calculations > Completed 500 calculations > Completed 600 calculations > Completed 700 calculations > Completed 800 calculations > Completed 900 calculations > Completed 1000 calculations > Completed 1000 calculations
## Period 2
## > Completed 100 calculations > Completed 200 calculations > Completed 300 calculations > Completed 400 calculations > Completed 500 calculations > Completed 600 calculations > Completed 700 calculations > Completed 800 calculations > Completed 900 calculations > Completed 1000 calculations > Completed 1000 calculations
## Siena Goodness of Fit ( OutdegreeDistribution ), all periods
## =====
## Monte Carlo Mahalanobis distance test p-value: 0.468
## **Note: Only 11 statistics are necessary in the auxiliary function.
## -----
## One tailed test used (i.e. estimated probability of greater distance than observation).
## -----
## Calculated joint MHD = ( 15.04 ) for current model.
(gof0.tc <- sienaGOF(s50.hom.ans, TriadCensus, verbose=TRUE, join=TRUE,
varName="friendship"))
## Detected 1000 iterations and 1 group.
## Calculating auxiliary statistics for periods 1 2 .
## Period 1
## > Completed 100 calculations > Completed 200 calculations > Completed 300 calculations > Completed 400 calculations > Completed 500 calculations > Completed 600 calculations > Completed 700 calculations > Completed 800 calculations > Completed 900 calculations > Completed 1000 calculations > Completed 1000 calculations
## Period 2
## > Completed 100 calculations > Completed 200 calculations > Completed 300 calculations > Completed 400 calculations > Completed 500 calculations > Completed 600 calculations > Completed 700 calculations > Completed 800 calculations > Completed 900 calculations > Completed 1000 calculations > Completed 1000 calculations
## Siena Goodness of Fit ( TriadCensus ), all periods
## =====
## Monte Carlo Mahalanobis distance test p-value: 0.004
## **Note: Only 15 statistics are necessary in the auxiliary function.
## -----
## One tailed test used (i.e. estimated probability of greater distance than observation).
## -----
## Calculated joint MHD = ( 70.48 ) for current model.
Plotting routines are defined for the GOF objects. For example, the triad census:
plot( gof0.tc ,scale=TRUE)
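The same plotting routine applies to the degree-based GOF objects created above; for instance, the in- and out-degree distributions (this reuses the gofi0 and gofo0 objects from the calls earlier in this section):

```r
# violin plots of the simulated degree distributions against the observed values
plot(gofi0)
plot(gofo0)
```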
We are now going to use drinking alcohol both as a predictor of the network and as a dependent variable.
Since the alcohol variable is a dependent variable, we use
sienaDependent to format it (print the object to screen to get a quick
view of the information stored)
drink <- s50a
drinking <- sienaDependent( drink, type = "behavior" )
We now have a dataset with two dependent variables so we need to
create a new siena data object using sienaDataCreate:
s50data.infl <- sienaDataCreate( friendship, smoke1, drinking )
With an additional dependent variable, additional effects become available:
s50.effects.infl <- getEffects(s50data.infl)
The dependent variable drinking can act as a covariate
on the network. We are going to use the ‘complicated’, five-parameter,
homophily parametrisation of Lomi and Snijders (2019); here the squared and
squared-difference terms are commented out, leaving three effects:
s50.effects.infl <- includeEffects( s50.effects.infl,# again, 'adding' to the effects
egoX,# sender effect for alcohol
#egoSqX,# non-linear sender effect
altX,# receiver effect for alcohol
# altSqX,# non-linear receiver effect
# diffSqX,# squared difference of alcohol ego and alcohol alter
egoXaltX,
name = 'friendship',
interaction1 = "drinking" # DV works the same as an IV on another DV
# include =FALSE
)
## effectName shortName include fix test initialValue parm
## 1 drinking alter altX TRUE FALSE FALSE 0 0
## 2 drinking ego egoX TRUE FALSE FALSE 0 0
## 3 drinking ego x drinking alter egoXaltX TRUE FALSE FALSE 0 0
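With only these three effects switched on, the drinking-related contribution to the network evaluation function for a tie \(i\to j\) is (a sketch; \(v\) denotes the centred drinking scores, as RSiena centres covariates by default):

\[ \beta_{ego}\, v_i + \beta_{alt}\, v_j + \beta_{ego\times alt}\, v_i v_j \]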
The dependent variable drinking can itself depend on other
variables, including the other dependent variables.
s50.effects.infl <- includeEffects( s50.effects.infl,# still augmenting effects structure
avAlt,# this is the shortname for "average alter"
name="drinking",# name: what is the dependent variable
interaction1 = "friendship" # the network is now "independent" variable
)
## effectName shortName include fix test initialValue parm
## 1 drinking average alter avAlt TRUE FALSE FALSE 0 0
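To fix ideas, the average-alter statistic can be computed by hand from an adjacency matrix. A minimal base-R sketch with toy data (hypothetical scores, not the s50 values):

```r
# toy data: 4 actors with drinking scores and a directed friendship matrix
drink.v <- c(1, 3, 5, 2)                 # hypothetical drinking scores
x <- rbind(c(0, 1, 1, 0),                # actor 1 names actors 2 and 3
           c(1, 0, 0, 0),
           c(0, 0, 0, 1),
           c(0, 0, 0, 0))                # actor 4 is an out-isolate
outdeg <- rowSums(x)
# average drinking of each actor's alters (NA for out-isolates)
avalt <- ifelse(outdeg > 0, as.vector(x %*% drink.v) / outdeg, NA)
avalt  # actor 1: mean(3, 5) = 4
```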
We use the same social selection effects for smoking: first the sender effect
s50.effects.infl <- includeEffects( s50.effects.infl,# we add an effect to our effects object
egoX,# the shortname here evokes that variable for 'ego' affects decision
interaction1 = "smoke1" # the variable we want to interact the DV with
)
and the receiver effect
s50.effects.infl <- includeEffects( s50.effects.infl,
altX,# the shortname here evokes that variable for 'alter' affects decision of ego
interaction1 = "smoke1" # the variable we want to interact the DV with
)
and the homophily effect
s50.effects.infl <- includeEffects( s50.effects.infl ,# we add an effect to our effects object
egoXaltX,# the shortname
interaction1 = "smoke1" # the variable we want to interact the DV with
)
We will use two different transitivity effects
s50.effects.infl <- includeEffects( s50.effects.infl, transTrip, transRecTrip )
## effectName shortName include fix test initialValue
## 1 transitive triplets transTrip TRUE FALSE FALSE 0
## 2 transitive recipr. triplets transRecTrip TRUE FALSE FALSE 0
## parm
## 1 0
## 2 0
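The transitive triplets statistic can be counted directly from an adjacency matrix; a self-contained base-R sketch on a toy network (illustration only):

```r
# toy 3-actor network with ties 1->2, 2->3 and 1->3: exactly one transitive triplet
x <- matrix(0, 3, 3)
x[1, 2] <- 1; x[2, 3] <- 1; x[1, 3] <- 1
# transTrip counts ordered triples (i, j, h) with i->j, j->h and i->h
transTrip <- sum(diag(x %*% x %*% t(x)))
transTrip  # 1
```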
Set up estimation settings:
s50.algo.infl<- sienaAlgorithmCreate( projname = 's50_infl' )
And we can estimate the model (it will take a little longer)
s50.est.infl <- siena07( s50.algo.infl,
data = s50data.infl,
effects = s50.effects.infl,
batch = TRUE,
returnDeps = TRUE )
We view the table of estimates as before:
summary(s50.est.infl)
## Estimates, standard errors and convergence t-ratios
##
## Estimate Standard Convergence
## Error t-ratio
## Network Dynamics
## 1. rate constant friendship rate (period 1) 6.3705 ( 1.0095 ) 0.0689
## 2. rate constant friendship rate (period 2) 5.0700 ( 0.7856 ) 0.0088
## 3. eval outdegree (density) -2.8608 ( 0.1462 ) 0.0552
## 4. eval reciprocity 2.7398 ( 0.2717 ) 0.0794
## 5. eval transitive triplets 0.8971 ( 0.1516 ) 0.1141
## 6. eval transitive recipr. triplets -0.5488 ( 0.2416 ) 0.1188
## 7. eval smoke1 alter -0.0545 ( 0.2182 ) -0.0406
## 8. eval smoke1 ego -0.0439 ( 0.2163 ) -0.0043
## 9. eval smoke1 ego x smoke1 alter 0.3151 ( 0.3745 ) 0.0126
## 10. eval drinking alter -0.0559 ( 0.1047 ) -0.0188
## 11. eval drinking ego 0.0265 ( 0.1176 ) 0.0051
## 12. eval drinking ego x drinking alter 0.1637 ( 0.0789 ) -0.0318
##
## Behavior Dynamics
## 13. rate rate drinking (period 1) 1.3323 ( 0.3838 ) 0.0533
## 14. rate rate drinking (period 2) 1.7826 ( 0.4598 ) 0.0000
## 15. eval drinking linear shape 0.4145 ( 0.2152 ) -0.0171
## 16. eval drinking quadratic shape -0.5541 ( 0.2971 ) -0.0233
## 17. eval drinking average alter 1.1869 ( 0.7233 ) -0.0389
##
## Overall maximum convergence ratio: 0.2069
##
##
## Total of 3315 iteration steps.
##
## Covariance matrix of estimates (correlations below diagonal)
##
## 1.019 -0.031 0.024 0.000 -0.020 0.017 -0.005 0.016 -0.002 -0.001 0.008 -0.019 -0.010 -0.008 -0.001 -0.004 0.024
## -0.039 0.617 0.010 0.013 -0.012 0.005 -0.009 0.005 0.005 0.003 0.003 -0.009 0.003 -0.042 0.011 -0.026 0.060
## 0.161 0.090 0.021 -0.027 -0.015 0.019 0.003 0.001 0.000 0.000 0.000 -0.001 0.000 -0.006 0.001 -0.005 0.014
## 0.001 0.063 -0.683 0.074 0.024 -0.045 0.002 0.005 -0.012 -0.002 0.002 -0.004 -0.008 0.005 -0.001 0.005 -0.016
## -0.128 -0.104 -0.693 0.577 0.023 -0.032 -0.001 0.000 -0.006 0.001 -0.001 0.000 0.001 0.004 -0.002 0.005 -0.013
## 0.071 0.027 0.537 -0.685 -0.879 0.058 0.000 -0.003 0.012 0.000 0.001 0.000 -0.002 -0.003 0.003 -0.005 0.011
## -0.022 -0.051 0.087 0.038 -0.024 -0.009 0.048 -0.006 -0.014 -0.011 0.001 -0.004 -0.005 0.000 0.001 0.000 -0.002
## 0.074 0.028 0.028 0.083 0.007 -0.060 -0.122 0.047 -0.015 0.000 -0.010 -0.001 0.003 -0.008 0.001 0.005 -0.009
## -0.005 0.016 0.005 -0.115 -0.109 0.129 -0.176 -0.186 0.140 0.003 0.001 -0.005 0.008 0.001 -0.003 -0.003 0.010
## -0.006 0.031 -0.009 -0.076 0.034 -0.004 -0.485 -0.009 0.086 0.011 -0.006 0.002 0.002 0.003 -0.006 0.005 -0.012
## 0.067 0.036 -0.012 0.064 -0.057 0.020 0.041 -0.405 0.014 -0.509 0.014 -0.001 0.002 0.001 0.004 -0.005 0.012
## -0.244 -0.145 -0.127 -0.172 -0.021 0.019 -0.226 -0.060 -0.184 0.206 -0.126 0.006 0.002 0.003 -0.001 -0.001 0.000
## -0.026 0.011 0.003 -0.073 0.021 -0.020 -0.060 0.035 0.055 0.038 0.051 0.067 0.147 0.009 -0.015 -0.006 0.021
## -0.018 -0.116 -0.087 0.039 0.053 -0.028 -0.002 -0.081 0.007 0.071 0.016 0.090 0.053 0.211 -0.019 0.010 -0.012
## -0.002 0.065 0.033 -0.016 -0.072 0.059 0.031 0.016 -0.037 -0.267 0.166 -0.037 -0.176 -0.193 0.046 -0.032 0.078
## -0.013 -0.111 -0.125 0.060 0.121 -0.065 0.003 0.074 -0.026 0.158 -0.150 -0.037 -0.050 0.072 -0.496 0.088 -0.200
## 0.033 0.105 0.134 -0.079 -0.122 0.065 -0.014 -0.056 0.038 -0.159 0.144 0.007 0.076 -0.036 0.499 -0.931 0.523
##
## Derivative matrix of expected statistics X by parameters:
##
## 10.846 0.000 0.906 1.086 15.195 14.139 0.105 -0.142 0.238 -2.091 -2.667 5.127 -0.023 0.000 0.223 -0.871 -1.106
## 0.000 12.202 2.265 1.958 18.335 15.272 0.587 0.454 0.115 0.236 -0.505 3.243 0.000 0.109 0.089 0.139 -1.245
## 50.764 33.077 363.379 280.452 817.175 634.312 5.228 7.711 24.866 -22.467 -8.682 194.833 0.094 0.025 7.694 12.248 18.774
## 13.352 5.209 145.308 150.641 327.163 292.031 1.142 2.010 11.058 -9.837 -11.541 86.260 1.554 -1.051 2.148 7.987 12.252
## 78.274 36.567 467.519 392.247 1716.735 1395.206 25.056 25.298 38.799 23.171 34.757 298.165 -1.232 -1.219 6.623 9.016 18.791
## 30.848 13.941 240.590 232.190 901.857 806.177 11.574 12.825 19.474 9.118 12.836 156.025 -0.026 -0.531 1.532 7.808 12.790
## 1.312 3.918 7.408 8.703 54.431 49.493 51.109 33.330 10.437 84.991 66.843 28.500 0.674 -0.422 5.558 5.595 5.425
## -0.404 1.073 7.095 6.704 48.954 41.237 34.356 48.491 11.062 70.960 79.591 25.019 -1.239 -0.013 -0.478 -0.666 0.257
## 5.374 2.555 24.859 22.295 77.236 63.628 11.567 11.236 14.344 14.311 15.673 38.107 -0.536 -0.647 0.862 2.430 1.999
## -21.260 3.557 -3.908 1.955 35.462 40.728 87.137 71.277 11.308 342.080 243.785 -22.552 3.663 0.275 34.561 8.804 13.343
## -9.727 7.338 27.218 21.179 137.570 103.404 66.465 72.435 15.798 227.581 283.603 31.627 -1.532 -0.686 1.391 10.853 2.079
## 45.419 40.415 235.079 211.320 644.574 533.100 44.438 43.392 39.529 20.076 30.548 516.077 -4.757 -5.033 -0.166 42.562 32.474
## 0.887 0.000 -0.483 -0.284 -3.570 -1.985 -0.288 -1.579 -1.002 -1.041 -3.114 -7.078 12.408 0.000 7.754 -2.915 -1.760
## 0.000 1.577 -1.763 -1.811 -4.193 -3.517 0.208 0.699 -0.210 -0.632 -1.088 -9.913 0.000 10.688 6.356 -1.702 1.035
## 2.038 -0.151 3.057 1.883 7.457 4.205 5.910 3.852 1.224 22.804 17.470 1.962 6.344 5.586 54.984 10.609 9.344
## 6.777 4.059 38.413 35.739 89.866 75.436 -0.313 1.495 4.255 -5.186 -4.875 64.773 -0.256 -2.799 1.063 132.919 85.777
## 1.894 0.716 16.074 15.982 46.370 39.873 0.138 0.764 1.876 -1.100 -2.579 25.645 -1.677 -2.158 -5.828 49.384 43.908
##
## Covariance matrix of X (correlations below diagonal):
##
## 127.357 -4.803 97.792 70.282 361.894 291.747 0.065 1.070 9.044 -42.721 -35.052 66.644 -0.538 -0.339 1.344 2.309 0.166
## -0.044 92.947 60.674 40.791 165.101 129.140 7.481 6.879 3.757 19.577 12.953 47.855 1.008 -0.208 1.694 3.269 -1.283
## 0.352 0.256 605.792 483.090 1681.321 1325.372 15.742 17.228 49.487 -21.047 -6.461 396.256 -5.082 -2.904 5.718 45.670 59.619
## 0.296 0.201 0.934 441.195 1414.677 1174.390 14.641 16.835 43.830 -9.289 -6.038 341.319 -4.702 -1.758 4.488 41.657 55.495
## 0.394 0.211 0.840 0.828 6611.582 5373.220 87.404 79.009 166.424 86.047 109.527 1237.253 -10.793 -0.494 36.679 103.787 151.450
## 0.384 0.199 0.799 0.830 0.981 4541.249 74.657 68.447 137.677 77.211 79.358 1005.310 -10.302 1.542 28.646 87.123 128.165
## 0.001 0.088 0.073 0.079 0.123 0.126 76.915 63.383 21.560 146.232 128.784 54.889 2.876 1.491 14.198 3.953 4.746
## 0.011 0.084 0.082 0.094 0.114 0.119 0.846 72.926 21.671 130.926 133.459 60.928 1.123 1.431 10.092 6.468 4.855
## 0.168 0.082 0.422 0.438 0.430 0.429 0.516 0.533 22.708 29.871 32.047 74.107 -1.118 -0.815 1.793 8.474 9.227
## -0.160 0.086 -0.036 -0.019 0.045 0.049 0.707 0.650 0.266 556.513 450.570 -6.794 11.915 4.852 60.805 4.142 7.392
## -0.137 0.059 -0.012 -0.013 0.059 0.052 0.645 0.687 0.296 0.840 517.528 0.982 7.266 2.774 42.026 4.527 1.260
## 0.189 0.158 0.514 0.519 0.486 0.476 0.200 0.228 0.497 -0.009 0.001 980.870 -15.136 -17.776 -6.791 119.040 99.307
## -0.010 0.023 -0.045 -0.049 -0.029 -0.034 0.072 0.029 -0.051 0.111 0.070 -0.106 20.771 0.139 13.107 -3.330 -3.774
## -0.006 -0.005 -0.025 -0.018 -0.001 0.005 0.036 0.035 -0.036 0.043 0.026 -0.119 0.006 22.691 14.313 -0.615 1.634
## 0.012 0.018 0.024 0.022 0.047 0.044 0.168 0.122 0.039 0.267 0.191 -0.022 0.298 0.311 93.241 9.680 20.593
## 0.014 0.024 0.131 0.139 0.090 0.091 0.032 0.053 0.125 0.012 0.014 0.267 -0.051 -0.009 0.071 202.139 132.697
## 0.001 -0.010 0.189 0.206 0.145 0.148 0.042 0.044 0.151 0.024 0.004 0.247 -0.065 0.027 0.166 0.727 164.624
Make a nicer listing of the results:
siena.table(s50.est.infl, type="html", sig=TRUE)
## Results for s50.est.infl written to s50.est.infl.html .
Let us explore more effects using the s50 data set.
Instead of using smoking as a constant covariate, we will define smoking as yet another dependent variable
smoke <- s50s
Define dependent and independent variables
# format the dependent variables:
friendshipData <- array( c( friend.data.w1,
friend.data.w2,
friend.data.w3 ),
dim = c( 50, 50, 3 ) )
friendship <- sienaDependent(friendshipData)
drinking <- sienaDependent( drink, type = "behavior" )
smoking <- sienaDependent( smoke, type = "behavior" )
s50.big.data <- sienaDataCreate( friendship, smoking, drinking )
# choose effects
s50.big.effs <- getEffects( s50.big.data )
The effects available to us are listed in the object
s50.big.effs. Have a look!
The most important columns are:
| name | effectName | shortName | interaction1 | interaction2 | include | fix | test |
|---|---|---|---|---|---|---|---|
| dependent variable that owns the effect | description | what you use to define the model | covariate 1 | covariate 2 | in model, T/F | fixed, T/F | tested, T/F |
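A Siena effects object behaves like a data frame with these columns, so it can be subset with ordinary indexing. A toy stand-in (hypothetical rows, not the real object) shows the idiom:

```r
# hypothetical stand-in for an effects object: a plain data frame
effs <- data.frame(
  name       = c("friendship", "friendship", "drinking"),
  effectName = c("outdegree (density)", "reciprocity", "drinking average alter"),
  shortName  = c("density", "recip", "avAlt"),
  include    = c(TRUE, TRUE, FALSE)
)
# list only the effects currently switched on, as you would for s50.big.effs
included <- effs[effs$include, c("name", "shortName")]
included
```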
The manual has a comprehensive list of possible effects
Let us examine only the structural part of the previous model
s50.big.effs <- includeEffects( s50.big.effs, transTrip, transRecTrip )
## effectName shortName include fix test initialValue
## 1 transitive triplets transTrip TRUE FALSE FALSE 0
## 2 transitive recipr. triplets transRecTrip TRUE FALSE FALSE 0
## parm
## 1 0
## 2 0
Set up estimation settings:
s50.algo.struc.1<- sienaAlgorithmCreate( projname = 's50_struc.1' )
And estimate
s50.est.struc.1 <- siena07( s50.algo.struc.1,
data = s50.big.data ,
effects =s50.big.effs ,
batch = TRUE,
returnDeps = TRUE )
Check results
summary( s50.est.struc.1 )
What happens if we substitute transTies for
transRecTrip?
s50.big.effs <- includeEffects( s50.big.effs,transRecTrip,include=FALSE )
## effectName shortName include fix test initialValue
## 1 transitive recipr. triplets transRecTrip FALSE FALSE FALSE 0
## parm
## 1 0
s50.big.effs <- includeEffects( s50.big.effs, transTies )
## effectName shortName include fix test initialValue parm
## 1 transitive ties transTies TRUE FALSE FALSE 0 0
And estimate
s50.est.struc.2 <- siena07( s50.algo.struc.1,
data = s50.big.data ,
effects =s50.big.effs ,
batch = TRUE,
returnDeps = TRUE )
summary( s50.est.struc.2 )
Include all three
s50.big.effs <- includeEffects( s50.big.effs,transRecTrip,include=TRUE )
## effectName shortName include fix test initialValue
## 1 transitive recipr. triplets transRecTrip TRUE FALSE FALSE 0
## parm
## 1 0
And estimate
s50.est.struc.3 <- siena07( s50.algo.struc.1,
data = s50.big.data ,
effects =s50.big.effs ,
batch = TRUE,
returnDeps = TRUE )
Check results
summary( s50.est.struc.3 )
An alternative to having both transTies and
transTrip is to use GWESP
s50.big.effs <- includeEffects( s50.big.effs,transTies,transTrip,include=FALSE )
## effectName shortName include fix test initialValue parm
## 1 transitive triplets transTrip FALSE FALSE FALSE 0 0
## 2 transitive ties transTies FALSE FALSE FALSE 0 0
s50.big.effs <- includeEffects( s50.big.effs,gwespFF)
## effectName shortName include fix test initialValue parm
## 1 GWESP I -> K -> J (#) gwespFF TRUE FALSE FALSE 0 69
s50.est.struc.4 <- siena07( s50.algo.struc.1,
data = s50.big.data ,
effects =s50.big.effs ,
batch = TRUE,
returnDeps = TRUE )
Check results
summary( s50.est.struc.4 )
What differences do we see between the different parametrizations?
Looking at s50.big.effs, all effects with
name=friendship and interaction1=drinking are
social selection effects with respect to drinking.
s50.effects.infl <- s50.big.effs
s50.effects.infl <- includeEffects( s50.effects.infl,# again, 'adding' to the effects
egoX,# sender effect for alcohol
egoSqX,# non-linear sender effect
altX,# receiver effect for alcohol
altSqX,# non-linear receiver effect
diffSqX,# squared difference of alcohol ego and alcohol alter
interaction1 = "drinking" # DV works the same as an IV on another DV
)
## effectName shortName include fix test initialValue parm
## 1 drinking alter altX TRUE FALSE FALSE 0 0
## 2 drinking squared alter altSqX TRUE FALSE FALSE 0 0
## 3 drinking ego egoX TRUE FALSE FALSE 0 0
## 4 drinking squared ego egoSqX TRUE FALSE FALSE 0 0
## 5 drinking diff. squared diffSqX TRUE FALSE FALSE 0 0
Looking at s50.big.effs, all effects with
name=friendship and interaction1=smoking are
social selection effects with respect to smoking.
s50.effects.infl <- includeEffects( s50.effects.infl,# we "add and effect" to our effects object
egoX,# the shortname here evokes that variable for 'ego' affects decision
interaction1 = "smoking" # the variable we want to interact the DV with
)
## effectName shortName include fix test initialValue parm
## 1 smoking ego egoX TRUE FALSE FALSE 0 0
Change or add a covariate effect of smoking on friendship.
Estimate!
s50.est.infl.1 <- siena07( s50.algo.struc.1,
data = s50.big.data ,
effects =s50.effects.infl,
batch = TRUE,
returnDeps = TRUE )
Check results
summary( s50.est.infl.1 )
As before, we specify the average-alter influence effect
s50.effects.infl.2 <- s50.big.effs
s50.effects.infl.2 <- includeEffects( s50.effects.infl.2,# still augmenting effects structure
avAlt,# this is the shortname for "average alter"
name="drinking",# name: what is the dependent variable
interaction1 = "friendship" # the network is now "independent" variable
)
## effectName shortName include fix test initialValue parm
## 1 drinking average alter avAlt TRUE FALSE FALSE 0 0
We can also add influence for smoking, but let us try
totAlt instead
s50.effects.infl.2 <- includeEffects( s50.effects.infl.2,# still augmenting effects structure
totAlt,# this is the shortname for "total alter"
name="smoking",# name: what is the dependent variable
interaction1 = "friendship" # the network is now "independent" variable
)
## effectName shortName include fix test initialValue parm
## 1 smoking total alter totAlt TRUE FALSE FALSE 0 0
The dependent variables can also be predictors of each other. Smoking as DV:
s50.effects.infl.2 <- includeEffects( s50.effects.infl.2,# still augmenting effects structure
effFrom,# the effect of 'interaction1' on smoking
name="smoking",# name: what is the dependent variable
interaction1 = "drinking"# drinking is now "independent" variable
)
## effectName shortName include fix test initialValue parm
## 1 smoking: effect from drinking effFrom TRUE FALSE FALSE 0 0
Drinking as DV
s50.effects.infl.2 <- includeEffects( s50.effects.infl.2,# still augmenting effects structure
effFrom,# the effect of 'interaction1' on drinking
name="drinking",# name: what is the dependent variable
interaction1 = "smoking"# smoking is now "independent" variable
)
## effectName shortName include fix test initialValue parm
## 1 drinking: effect from smoking effFrom TRUE FALSE FALSE 0 0
The effect of the total similarity of an ego's alters' drinking on ego's smoking (smoking tot. sim. (friendship) x alter's drinking) requires two interaction variables
s50.effects.infl.2 <- includeEffects( s50.effects.infl.2,# still augmenting effects structure
totSimAltX,# total similarity (via friendship) x alter's drinking, on smoking
name="smoking",# name: what is the dependent variable
interaction1 = "drinking",# drinking is now "independent" variable
interaction2 = "friendship"
)
## effectName shortName include fix
## 1 smoking: tot. sim. (friendship) x alter's drinking totSimAltX TRUE FALSE
## test initialValue parm
## 1 FALSE 0 0
Estimate!
s50.est.infl.2 <- siena07( s50.algo.struc.1,
data = s50.big.data ,
effects =s50.effects.infl.2,
batch = TRUE,
returnDeps = TRUE )
Check results
summary( s50.est.infl.2 )
What if we want to test if we should include totSimAltX
for drinking?
Then we add the effect but set test=TRUE and
fix=TRUE
s50.effects.infl.3 <- s50.effects.infl.2
s50.effects.infl.3 <- includeEffects( s50.effects.infl.3,# still augmenting effects structure
totSimAltX,# total similarity (via friendship) x alter's smoking, on drinking
name="drinking",# name: what is the dependent variable
interaction1 = "smoking",# smoking is the "independent" variable
interaction2 = "friendship" ,
test=TRUE, fix=TRUE
)
## effectName shortName include fix
## 1 drinking: tot. sim. (friendship) x alter's smoking totSimAltX TRUE TRUE
## test initialValue parm
## 1 TRUE 0 0
The parameter is now fixed to 0.
Estimate the model with totSimAltX for drinking fixed at 0.
Estimate!
(s50.est.infl.3 <- siena07( s50.algo.struc.1,
data = s50.big.data ,
effects =s50.effects.infl.3,
batch = TRUE,
returnDeps = TRUE ))
We can now test \[ H_0: \beta_{totSimAltX}=0 \] against
\[ H_1: \beta_{totSimAltX}\neq0 \]
using a Score-type test:
score.Test(s50.est.infl.3)
## Tested effects:
## drinking: drinking: tot. sim. (friendship) x alter's smoking eval
## chi-squared = 2.15, d.f. = 1; one-sided Z = -1.47; two-sided p = 0.143.
This test only uses the simulated statistics for the effect, so there is no need to actually estimate the parameter.
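The two-sided p-value reported above is just the normal tail probability of the one-sided Z statistic. A quick base-R check (the printed Z is rounded, so the match is only approximate):

```r
z <- -1.47                       # one-sided Z from the score test output
p.two.sided <- 2 * pnorm(-abs(z))
round(p.two.sided, 3)            # close to the reported 0.143
```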
What if my model does not converge?
If the convergence t-ratios are close to or above \(0.1\) in absolute value, restart the estimation using
prevAns
To provoke this, use too few iterations in Phase 3 and only 2 subphases in Phase 2
s50.algo.struc.short<- sienaAlgorithmCreate( projname = 's50_struc.1', n3 =50, nsub=2)
Estimate!
(s50.est.infl.3 <- siena07( s50.algo.struc.short,
data = s50.big.data ,
effects =s50.effects.infl.2,
batch = TRUE,
returnDeps = TRUE ))
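If this short run has not converged, the standard remedy is to re-run with full-length settings, passing the short run as prevAns so estimation restarts from its parameter values. A sketch reusing the objects above (assumes the RSiena session from this section):

```r
# full-length algorithm settings and a warm restart from the short run
s50.algo.struc.full <- sienaAlgorithmCreate( projname = 's50_struc.1' )
s50.est.infl.4 <- siena07( s50.algo.struc.full,
                           data = s50.big.data,
                           effects = s50.effects.infl.2,
                           batch = TRUE,
                           returnDeps = TRUE,
                           prevAns = s50.est.infl.3 )
s50.est.infl.4  # check: overall maximum convergence ratio below 0.25
```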